{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "vanilla-progress",
   "metadata": {},
   "source": [
    "# 第四章 基础实战——FashionMNIST时装分类"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "metallic-oxford",
   "metadata": {},
   "source": [
    "<img src=\"./fashion-mnist-sprite.png\" width=\"400\" />  \n",
    "  \n",
    "经过前面三章内容的学习，我们完成了以下的内容：  \n",
    "- 对PyTorch有了初步的认识\n",
    "- 学会了如何安装PyTorch以及对应的编程环境\n",
    "- 学习了PyTorch最核心的理论基础（张量&自动求导）\n",
    "- 梳理了利用PyTorch完成深度学习的主要步骤和对应实现方式  \n",
    "  \n",
    "现在，我们通过一个基础实战案例，将第一部分所涉及的PyTorch入门知识串起来，便于大家加深理解。同时为后续的进阶学习打好基础。 \n",
    "  \n",
    "我们这里的任务是对10个类别的“时装”图像进行分类，使用FashionMNIST数据集（https://github.com/zalandoresearch/fashion-mnist ）。上图给出了FashionMNIST中数据的若干样例图，其中每个小图对应一个样本。  \n",
    "FashionMNIST数据集中包含已经预先划分好的训练集和测试集，其中训练集共60,000张图像，测试集共10,000张图像。每张图像均为单通道黑白图像，大小为32\\*32pixel，分属10个类别。  \n",
    "  \n",
    "下面让我们一起将第三章各部分内容逐步实现，来跑完整个深度学习流程。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "finished-scottish",
   "metadata": {},
   "source": [
    "**首先导入必要的包**  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "involved-milwaukee",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import torch\n",
    "import torch.nn as nn\n",
    "import torch.optim as optim\n",
    "from torch.utils.data import Dataset, DataLoader"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "industrial-orange",
   "metadata": {},
   "source": [
    "**配置训练环境和超参数**  \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "quarterly-hypothetical",
   "metadata": {},
   "outputs": [],
   "source": [
    "# 配置GPU，这里有两种方式\n",
    "## 方案一：使用os.environ\n",
    "os.environ['CUDA_VISIBLE_DEVICES'] = '0'\n",
    "# 方案二：使用“device”，后续对要使用GPU的变量用.to(device)即可\n",
    "device = torch.device(\"cuda:1\" if torch.cuda.is_available() else \"cpu\")\n",
    "\n",
    "## 配置其他超参数，如batch_size, num_workers, learning rate, 以及总的epochs\n",
    "batch_size = 256\n",
    "num_workers = 4\n",
    "lr = 1e-4\n",
    "epochs = 20"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "apparent-reception",
   "metadata": {},
   "source": [
    "**数据读入和加载**  \n",
    "这里同时展示两种方式:  \n",
    "- 下载并使用PyTorch提供的内置数据集  \n",
    "- 从网站下载以csv格式存储的数据，读入并转成预期的格式    \n",
    "第一种数据读入方式只适用于常见的数据集，如MNIST，CIFAR10等，PyTorch官方提供了数据下载。这种方式往往适用于快速测试方法（比如测试下某个idea在MNIST数据集上是否有效）  \n",
    "第二种数据读入方式需要自己构建Dataset，这对于PyTorch应用于自己的工作中十分重要  \n",
    "  \n",
    "同时，还需要对数据进行必要的变换，比如说需要将图片统一为一致的大小，以便后续能够输入网络训练；需要将数据格式转为Tensor类，等等。\n",
    "  \n",
    "这些变换可以很方便地借助torchvision包来完成，这是PyTorch官方用于图像处理的工具库，上面提到的使用内置数据集的方式也要用到。PyTorch的一大方便之处就在于它是一整套“生态”，有着官方和第三方各个领域的支持。这些内容我们会在后续课程中详细介绍。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "frank-punishment",
   "metadata": {},
   "outputs": [],
   "source": [
    "# 首先设置数据变换\n",
    "from torchvision import transforms\n",
    "\n",
    "image_size = 28\n",
    "data_transform = transforms.Compose([\n",
    "    transforms.ToPILImage(),   # 这一步取决于后续的数据读取方式，如果使用内置数据集则不需要\n",
    "    transforms.Resize(image_size),\n",
    "    transforms.ToTensor()\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "canadian-temple",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/data1/ljq/anaconda3/envs/smp/lib/python3.8/site-packages/torchvision/datasets/mnist.py:498: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448234945/work/torch/csrc/utils/tensor_numpy.cpp:180.)\n",
      "  return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)\n"
     ]
    }
   ],
   "source": [
    "## 读取方式一：使用torchvision自带数据集，下载可能需要一段时间\n",
    "from torchvision import datasets\n",
    "\n",
    "train_data = datasets.FashionMNIST(root='./', train=True, download=True, transform=data_transform)\n",
    "test_data = datasets.FashionMNIST(root='./', train=False, download=True, transform=data_transform)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "loose-enterprise",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "## 读取方式二：读入csv格式的数据，自行构建Dataset类\n",
    "class FMDataset(Dataset):\n",
    "    def __init__(self, df, transform=None):\n",
    "        self.df = df\n",
    "        self.transform = transform\n",
    "        self.images = df.iloc[:,1:].values.astype(np.uint8)\n",
    "        self.labels = df.iloc[:, 0].values\n",
    "        \n",
    "    def __len__(self):\n",
    "        return len(self.images)\n",
    "    \n",
    "    def __getitem__(self, idx):\n",
    "        image = self.images[idx].reshape(28,28,1)\n",
    "        label = int(self.labels[idx])\n",
    "        if self.transform is not None:\n",
    "            image = self.transform(image)\n",
    "        else:\n",
    "            image = torch.tensor(image/255., dtype=torch.float)\n",
    "        label = torch.tensor(label, dtype=torch.long)\n",
    "        return image, label\n",
    "\n",
    "train_df = pd.read_csv(\"./FashionMNIST/fashion-mnist_train.csv\")\n",
    "test_df = pd.read_csv(\"./FashionMNIST/fashion-mnist_test.csv\")\n",
    "train_data = FMDataset(train_df, data_transform)\n",
    "test_data = FMDataset(test_df, data_transform)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "handmade-samba",
   "metadata": {},
   "source": [
    "在构建训练和测试数据集完成后，需要定义DataLoader类，以便在训练和测试时加载数据  \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "mobile-training",
   "metadata": {},
   "outputs": [],
   "source": [
    "train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True, num_workers=num_workers, drop_last=True)\n",
    "test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False, num_workers=num_workers)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "elder-hardware",
   "metadata": {},
   "source": [
    "读入后，我们可以做一些数据可视化操作，主要是验证我们读入的数据是否正确"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "accurate-butterfly",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([256, 1, 28, 28]) torch.Size([256])\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<matplotlib.image.AxesImage at 0x7f19a043cc10>"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAPsAAAD4CAYAAAAq5pAIAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAAsTAAALEwEAmpwYAAARN0lEQVR4nO3dW4xVVZ7H8d9fhHATBJWSqzReAB2jjogmmAnGaJBo0BdtHiaatJbxktiJMSg+tA+adMy0ZhKjSRGJOOmRdNLdI4kt044anYkJCAaVyyA3CVSoAuV+EQT+81AHp9Ta/1WefW66vp+kUlX7X+ucxan6sc85a6+1zN0F4JfvrGZ3AEBjEHYgE4QdyARhBzJB2IFMnN3IOzMz3voH6szdra/jpc7sZjbHzDaa2WYze7LMbQGoL6t2nN3MBkj6QtItknZK+ljSfHdfH7ThzA7UWT3O7DMlbXb3re5+QtJSSfNK3B6AOioT9vGSdvT6fmfl2PeYWbuZrTKzVSXuC0BJdX+Dzt07JHVIPI0HmqnMmb1T0sRe30+oHAPQgsqE/WNJl5rZr8xskKRfS1pWm24BqLWqn8a7+0kze1TSf0oaIGmxu6+rWc8A1FTVQ29V3Rmv2YG6q8tFNQB+Pgg7kAnCDmSCsAOZIOxAJgg7kImGzmdH6znrrPj/+9OnTzeoJz/dlClTwvqVV15ZWJs1a1bYtru7O6zv2LEjrF988cVhfciQIYW1119/PWy7efPmsF6EMzuQCcIOZIKwA5kg7EAmCDuQCcIOZIKhN9TVHXfcUVh74IEHwrZXXHFFWD916lRY37p1a2Ft27ZtYdstW7aE9XvuuSesjxo1Kqzv27evsLZz586wbbU4swOZIOxAJgg7kAnCDmSCsAOZIOxAJgg7kAnG2X/hyk5hPfvs+E9k4cKFYX3EiBGFtRUrVoRtn3322bC+cuXKsF5Pq1evDuvt7e1Vt+/q6grbzp49u7C2cePGwhpndiAThB3IBGEHMkHYgUwQdiAThB3IBGEHMsE4+y9c2XH2aD66JE2dOjWsP/fcc4W19evXh21b2apVq8L6oEGDwvpHH31UWEstkd3W1lZYi+bwlwq7mX0p6ZCkU5JOuvuMMrcHoH5qcWa/yd2/qsHtAKgjXrMDmSgbdpf0dzNbbWZ9XgxsZu1mtsrM4hc5AOqq7NP4G92908zGSHrHzP7X3T/s/QPu3iGpQ5LMzEveH4AqlTqzu3tn5fNuSX+VNLMWnQJQe1WH3cyGmdk5Z76WdKuktbXqGIDaMvfqnlmb2RT1nM2lnpcD/+7uxYOq4ml8M9R7PvvLL78c1qOtiQcPHhy2HTNmTFhfvnx5WH/vvfcKa6m59CNHjgzrb7/9dlifN29eWN+zZ09YL8Pdra/jVb9md/etkq6qukcAGoqhNyAThB3IBGEHMkHYgUwQdiATVQ+9VXVnDL1lZ/LkyYW1u+66K2x70003hfVrrrkmrE+YMKGwlppe++mnn4b1yy67LKzff//9YX3NmjVhvYyioTfO7EAmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZIJx9l84sz6HXGumkX8/P9WkSZOqqknSggULwnpqem5qavHSpUsLa4sWLQrbRr9Td2ecHcgdYQcyQdiBTBB2IBOEHcgEYQcyQdiBTLBl8y9cK4+D19vBgwcLaydOnCh128OGDQvrX3/9dVg/efJkqfuvBmd2IBOEHcgEYQcyQdiBTBB2IBOEHcgEYQcywTg7SpkyZUpYv++++wprc+fODdu2tbWF9dR209E1BqtXrw7bHj9+PKwfOHAgrEdj/JK0YcOGsB6p9tqJ5JndzBab2W4zW9vr2Ggze8fMNlU+j6rq3gE0TH+exr8mac4Pjj0p6V13v1TSu5XvAbSwZNjd/UNJe39weJ6kJZWvl0i6s7bdAlBr1b5mb3P3XZWvuyQVvrgys3ZJ7VXeD4AaKf0Gnbt7tJCku3dI6pBYcBJopmqH3rrNbKwkVT7vrl2XANRDtWFfJuneytf3SnqzNt0BUC/JdePN7A1JsyWdL6lb0u8k/YekP0maJGm7pLvd/Ydv4vV1WzyNb7DUuvFl57u/8MILYb27u7uwtnz58rDtpk2bwvrRo0fDehmpdeOnT58e1vfv3x/WN27cWFh75ZVXwrYpRevGJ1+zu/v8gtLNpXoEoKG4XBbIBGEHMkHYgUwQdiAThB3IBFNcUcrAgQPDerRkcldXV9i2nkNrI0eODOuXX355WO/s7AzrqaWmJ0+eHNbrgTM7kAnCDmSCsAOZIOxAJgg7kAnCDmSCsAOZSE5xremdMcW15ZSdAjt06NCw/tZbbxXW9u3bF7Y9duxYWN++fXtY37NnT2Ft4sSJYdvUlspnnRWfJ88555ywHj1ujz/+eNh29+54rZiiKa6c2YFMEHYgE4QdyARhBzJB2IFMEHYgE4QdyATz2X8GUmPhkdQ4eao+YMCAsJ6ac/7UU08V1m677bawbUpqTvipU6cKa9u2bQvbDhkypJoufSd1jUD0uKbG+KvFmR3IBGEHMkHYgUwQdiAThB3IBGEHMkHYgUwwzv4z0I9ttet236dPny7VfuXKlYW1W2+9NWz77bffhvWdO3eG9QMHDhTWUuvGjxgxIqynxtFT1x+MGzeusJZaI2Dv3uTu6H1KntnNbLGZ7Taztb2OPWNmnWa2pvIxt6p7B9Aw/Xka/5qkOX0cf9Hdr658/K223QJQa8mwu/uHkqp73gCgZZR5g+5RM/us8jR/VNEPmVm7ma0ys1Ul7gtASdWG/RVJF0u6WtIuSX8o+kF373D3Ge4+o8r7AlADVYXd3bvd/ZS7n5a0SNLM2nYLQK1VFXYzG9vr27skrS36WQCtITnObmZvSJot6Xwz2ynpd5Jmm9nVklzSl5IerF8XayO1zndqPDmaf5xqW3YcPHX7jVz7/6eK+v7NN9+EbVNru6fG2SdNmlRYi+a6S9Lhw4fDempd+La2trB+wQUXFNZS/65qJcPu7vP7OPxqHfoCoI64XBbIBGEHMkHYgUwQdiAThB3IRMO3bI6GoerZl9SSyKmhmGYqu61yGfV83C655JKw/sQTT4T11FLSnZ2dhbXUUGxq2G/Hjh1hPTUFNhqSfOSRR8K2KWzZDGSOsAOZIOxAJgg7kAnCDmSCsAOZIOxAJhq+lHSzpmOmxoNT465jx44trN1yyy1V9emM1157LayXWUo61XbgwIFhPbWcc8rMmcXrmjz22GNh22uvvTasL168OKw///zzhbUXX3wxbJu6vmD79u1hfdSowpXaJKWn99YDZ3YgE4QdyARhBzJB2IFMEHYgE4QdyARhBzLRUls2l5m3ndqC9/rrrw/rqaWBo3H49evXh22nTZsW1ufP72sB3/+3dOnSsF7m2oXUOPrZZ8d/Iqm+7d+/v7C2YsWKsO1DDz0U1g8ePBjWI6ktmaN+S+m/1dTtM84OoG4IO5AJwg5kgrADmSDsQCYIO5AJwg5koqXG2cuMF0+dOjWsDxs2LKyn1gE/cOBAYS3V7w8++CCsT58+PazPmTMnrL///vuFtdR47qxZs8L6ww8/HNY3btwY1p9++umw3iypcfCvvvoqrKeuy0itE5Dahrsekmd2M5toZu+b2XozW2dmj1WOjzazd8xsU+VzPFsfQFP152n8SUmPu/vlkm6Q9IiZXS7pSUnvuvulkt6tfA+gRSXD7u673P2TyteHJG2QNF7SPElLKj+2RNKddeojgBr4Sa/ZzWyypGskrZDU5u67KqUuSW0FbdoltZfoI4Aa6Pe78WY2XNKfJf3W3b83A8F73qHq810qd+9w9xnuPqNUTwGU0q+wm9lA9QT9j+7+l8rhbjMbW6mPlbS7Pl0EUAvJp/HWM5fvVUkb3P2FXqVlku6V9PvK5zf7cVsaNGhQYf3BBx9M3UShL774IqwfPXo0rKeG5saMGVNYGzp0aKnbTk3Pveiii8L6vn37Cms333xz2PaGG24I6wsWLAjrqem9kdQ00VQ9NXw1evTowlpq6Gzw4MFhfciQIWE91bcTJ06E9Xroz2v2WZL+WdLnZramcmyhekL+JzP7jaTtku6uSw8B1EQy7O7+P5KK/ouNTxsAWgaXywKZIOxAJgg7kAnCDmSCsAOZaOgU1+HDh4dLOo8bNy5sH41NpsY1U1vopkRjvkeOHAnbppY87u7uDuup7YGvu+66qmpSehnrw4cPh/XUVNHo315mK+r+uPDCCwtre/fuDdumxtFTfU9d19GMrcs5swOZIOxAJgg7kAnCDmSCsAOZIOxAJgg7kImGjrOfPn1ahw4dKqyPHz8+bB+Ns6fmJ0dzvqX01sXROH5qrPncc88N66m5zdFcekm66qqrCmupbY9T4+ipLZuj32dZ0TbZUvraigkTJhTWhg8fHrYtO988dY1AaqnpeuDMDmSCsAOZIOxAJgg7kAnCDmSCsAOZIOxAJho6zn7kyBGtWLGisJ4ar7799tsLa9OmTQvb7t+/P6yfPHkyrEdj5eedd17Y9vjx42G9q6srrE+aNCmsL1q0qLC2bt26sG1K6nGpp7JzvqO/p2PHjoVtU2P4qXH01O8stc9BtfcdPWac2YFMEHYgE4QdyARhBzJB2IFMEHYgE4QdyIT1Y+3uiZJel9QmySV1uPu/mtkzkh6QtKfyowvd/W+J26rbYtmpPdCnTp0a1lN7oEfj+Kn17lNjtqn6li1bwvpLL70U1n+uUmPZZcbhU/sIpNY3SM3zT/1OU3sJlOHufT5w/bmo5qSkx939EzM7R9JqM3unUnvR3f+lVp0EUD/92Z99l6Rdla8PmdkGSfGSMgBazk96zW5mkyVdI+nMNa+PmtlnZrbYzPp8XmRm7Wa2ysxWlesqgDL6HXYzGy7pz5J+6+4HJb0i6WJJV6vnzP+Hvtq5e4e7z3D3GeW7C6Ba/Qq7mQ1UT9D/6O5/kSR373b3U+5+WtIiSTPr100AZSXDbj1vib4qaYO7v9Dr+NheP3aXpLW17x6AWunP0NuNkv5b0ueSzownLJQ0Xz1P4V3Sl5IerLyZF91W4/epBTJTNPSWDHstEXag/orCzhV0QCYIO5AJwg5kgrADmSDsQCYIO5AJwg5kgrADmSDsQCYIO5AJwg5kgrADmSDsQCYIO5CJhm7ZLOkrSdt7fX9+5VgratW+tWq/JPpWrVr2rXBN9IbOZ//RnZutatW16Vq1b63aL4m+VatRfeNpPJAJwg5kotlh72jy/UdatW+t2i+JvlWrIX1r6mt2AI3T7DM7gAYh7EAmmhJ2M5tjZhvNbLOZPdmMPhQxsy/N7HMzW9Ps/ekqe+jtNrO1vY6NNrN3zGxT5XO893Bj+/aMmXVWHrs1Zja3SX2baGbvm9l6M1tnZo9Vjjf1sQv61ZDHreGv2c1sgKQvJN0iaaekjyXNd/f1De1IATP7UtIMd2/6BRhm9k+SDkt63d3/oXLseUl73f33lf8oR7n7ghbp2zOSDjd7G+/KbkVje28zLulOSfepiY9d0K+71YDHrRln9pmSNrv7Vnc/IWmppHlN6EfLc/cPJe39weF5kpZUvl6inj+WhivoW0tw913u/knl60OSzmwz3tTHLuhXQzQj7OMl7ej1/U611n7vLunvZrbazNqb3Zk+tPXaZqtLUlszO9OH5DbejfSDbcZb5rGrZvvzsniD7sdudPd/lHSbpEcqT1dbkve8BmulsdN+bePdKH1sM/6dZj521W5/XlYzwt4paWKv7ydUjrUEd++sfN4t6a9qva2ou8/soFv5vLvJ/flOK23j3dc242qBx66Z2583I+wfS7rUzH5lZoMk/VrSsib040fMbFjljROZ2TBJt6r1tqJeJuneytf3SnqziX35nlbZxrtom3E1+bFr+vbn7t7wD0lz1fOO/BZJTzejDwX9miLp08rHumb3TdIb6nla96163tv4jaTzJL0raZOk/5I0uoX69m/q2dr7M/UEa2yT+najep6ifyZpTeVjbrMfu6BfDXncuFwWyARv0AGZIOxAJgg7kAnCDmSCsAOZIOxAJgg7kIn/A7G6wX7EeH83AAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "image, label = next(iter(train_loader))\n",
    "print(image.shape, label.shape)\n",
    "plt.imshow(image[0][0], cmap=\"gray\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "executive-graduate",
   "metadata": {},
   "source": [
    "**模型设计**  \n",
    "由于任务较为简单，这里我们手搭一个CNN，而不考虑当下各种模型的复杂结构  \n",
    "模型构建完成后，将模型放到GPU上用于训练  \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "powered-mount",
   "metadata": {},
   "outputs": [],
   "source": [
    "class Net(nn.Module):\n",
    "    def __init__(self):\n",
    "        super(Net, self).__init__()\n",
    "        self.conv = nn.Sequential(\n",
    "            nn.Conv2d(1, 32, 5),\n",
    "            nn.ReLU(),\n",
    "            nn.MaxPool2d(2, stride=2),\n",
    "            nn.Dropout(0.3),\n",
    "            nn.Conv2d(32, 64, 5),\n",
    "            nn.ReLU(),\n",
    "            nn.MaxPool2d(2, stride=2),\n",
    "            nn.Dropout(0.3)\n",
    "        )\n",
    "        self.fc = nn.Sequential(\n",
    "            nn.Linear(64*4*4, 512),\n",
    "            nn.ReLU(),\n",
    "            nn.Linear(512, 10)\n",
    "        )\n",
    "        \n",
    "    def forward(self, x):\n",
    "        x = self.conv(x)\n",
    "        x = x.view(-1, 64*4*4)\n",
    "        x = self.fc(x)\n",
    "        # x = nn.functional.normalize(x)\n",
    "        return x\n",
    "\n",
    "model = Net()\n",
    "model = model.cuda()\n",
    "# model = nn.DataParallel(model).cuda()   # 多卡训练时的写法，之后的课程中会进一步讲解"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "sporting-spouse",
   "metadata": {},
   "source": [
    "**设定损失函数**  \n",
    "使用torch.nn模块自带的CrossEntropy损失  \n",
    "PyTorch会自动把整数型的label转为one-hot型，用于计算CE loss  \n",
    "这里需要确保label是从0开始的，同时模型不加softmax层（使用logits计算）,这也说明了PyTorch训练中各个部分不是独立的，需要通盘考虑"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "front-translation",
   "metadata": {},
   "outputs": [],
   "source": [
    "criterion = nn.CrossEntropyLoss()\n",
    "# criterion = nn.CrossEntropyLoss(weight=[1,1,1,1,3,1,1,1,1,1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "small-thermal",
   "metadata": {},
   "outputs": [],
   "source": [
    "?nn.CrossEntropyLoss # 这里方便看一下weighting等策略"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "reserved-timber",
   "metadata": {},
   "source": [
    "**设定优化器**  \n",
    "这里我们使用Adam优化器  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "vanilla-detail",
   "metadata": {},
   "outputs": [],
   "source": [
    "optimizer = optim.Adam(model.parameters(), lr=0.001)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "rental-desperate",
   "metadata": {},
   "source": [
    "**训练和测试（验证）**  \n",
    "各自封装成函数，方便后续调用  \n",
    "关注两者的主要区别：  \n",
    "- 模型状态设置  \n",
    "- 是否需要初始化优化器\n",
    "- 是否需要将loss传回到网络\n",
    "- 是否需要每步更新optimizer  \n",
    "  \n",
    "此外，对于测试或验证过程，可以计算分类准确率"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "clean-deposit",
   "metadata": {},
   "outputs": [],
   "source": [
    "def train(epoch):\n",
    "    model.train()\n",
    "    train_loss = 0\n",
    "    for data, label in train_loader:\n",
    "        data, label = data.cuda(), label.cuda()\n",
    "        optimizer.zero_grad()\n",
    "        output = model(data)\n",
    "        loss = criterion(output, label)\n",
    "        loss.backward()\n",
    "        optimizer.step()\n",
    "        train_loss += loss.item()*data.size(0)\n",
    "    train_loss = train_loss/len(train_loader.dataset)\n",
    "    print('Epoch: {} \\tTraining Loss: {:.6f}'.format(epoch, train_loss))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "ordered-doubt",
   "metadata": {},
   "outputs": [],
   "source": [
    "def val(epoch):       \n",
    "    model.eval()\n",
    "    val_loss = 0\n",
    "    gt_labels = []\n",
    "    pred_labels = []\n",
    "    with torch.no_grad():\n",
    "        for data, label in test_loader:\n",
    "            data, label = data.cuda(), label.cuda()\n",
    "            output = model(data)\n",
    "            preds = torch.argmax(output, 1)\n",
    "            gt_labels.append(label.cpu().data.numpy())\n",
    "            pred_labels.append(preds.cpu().data.numpy())\n",
    "            loss = criterion(output, label)\n",
    "            val_loss += loss.item()*data.size(0)\n",
    "    val_loss = val_loss/len(test_loader.dataset)\n",
    "    gt_labels, pred_labels = np.concatenate(gt_labels), np.concatenate(pred_labels)\n",
    "    acc = np.sum(gt_labels==pred_labels)/len(pred_labels)\n",
    "    print('Epoch: {} \\tValidation Loss: {:.6f}, Accuracy: {:6f}'.format(epoch, val_loss, acc))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "agreed-threat",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/data1/ljq/anaconda3/envs/smp/lib/python3.8/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448234945/work/c10/core/TensorImpl.h:1156.)\n",
      "  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 1 \tTraining Loss: 0.659050\n",
      "Epoch: 1 \tValidation Loss: 0.420328, Accuracy: 0.852000\n",
      "Epoch: 2 \tTraining Loss: 0.403703\n",
      "Epoch: 2 \tValidation Loss: 0.350373, Accuracy: 0.872300\n",
      "Epoch: 3 \tTraining Loss: 0.350197\n",
      "Epoch: 3 \tValidation Loss: 0.293053, Accuracy: 0.893200\n",
      "Epoch: 4 \tTraining Loss: 0.322463\n",
      "Epoch: 4 \tValidation Loss: 0.283335, Accuracy: 0.892300\n",
      "Epoch: 5 \tTraining Loss: 0.300117\n",
      "Epoch: 5 \tValidation Loss: 0.268653, Accuracy: 0.903500\n",
      "Epoch: 6 \tTraining Loss: 0.282179\n",
      "Epoch: 6 \tValidation Loss: 0.247219, Accuracy: 0.907200\n",
      "Epoch: 7 \tTraining Loss: 0.268283\n",
      "Epoch: 7 \tValidation Loss: 0.242937, Accuracy: 0.907800\n",
      "Epoch: 8 \tTraining Loss: 0.257615\n",
      "Epoch: 8 \tValidation Loss: 0.234324, Accuracy: 0.912200\n",
      "Epoch: 9 \tTraining Loss: 0.245795\n",
      "Epoch: 9 \tValidation Loss: 0.231515, Accuracy: 0.914100\n",
      "Epoch: 10 \tTraining Loss: 0.238739\n",
      "Epoch: 10 \tValidation Loss: 0.229616, Accuracy: 0.914400\n",
      "Epoch: 11 \tTraining Loss: 0.230499\n",
      "Epoch: 11 \tValidation Loss: 0.228124, Accuracy: 0.915200\n",
      "Epoch: 12 \tTraining Loss: 0.221574\n",
      "Epoch: 12 \tValidation Loss: 0.211928, Accuracy: 0.921200\n",
      "Epoch: 13 \tTraining Loss: 0.217924\n",
      "Epoch: 13 \tValidation Loss: 0.209744, Accuracy: 0.921700\n",
      "Epoch: 14 \tTraining Loss: 0.206033\n",
      "Epoch: 14 \tValidation Loss: 0.215477, Accuracy: 0.921400\n",
      "Epoch: 15 \tTraining Loss: 0.203349\n",
      "Epoch: 15 \tValidation Loss: 0.215550, Accuracy: 0.919400\n",
      "Epoch: 16 \tTraining Loss: 0.196319\n",
      "Epoch: 16 \tValidation Loss: 0.210800, Accuracy: 0.923700\n",
      "Epoch: 17 \tTraining Loss: 0.191969\n",
      "Epoch: 17 \tValidation Loss: 0.207266, Accuracy: 0.923700\n",
      "Epoch: 18 \tTraining Loss: 0.185466\n",
      "Epoch: 18 \tValidation Loss: 0.207138, Accuracy: 0.924200\n",
      "Epoch: 19 \tTraining Loss: 0.178241\n",
      "Epoch: 19 \tValidation Loss: 0.204093, Accuracy: 0.924900\n",
      "Epoch: 20 \tTraining Loss: 0.176674\n",
      "Epoch: 20 \tValidation Loss: 0.197495, Accuracy: 0.928300\n"
     ]
    }
   ],
   "source": [
    "for epoch in range(1, epochs+1):\n",
    "    train(epoch)\n",
    "    val(epoch)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "confident-vermont",
   "metadata": {},
   "source": [
    "**模型保存**  \n",
    "训练完成后，可以使用torch.save保存模型参数或者整个模型，也可以在训练过程中保存模型  \n",
    "这部分会在后面的课程中详细介绍"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "comic-phone",
   "metadata": {},
   "outputs": [],
   "source": [
    "save_path = \"./FahionModel.pkl\"\n",
    "torch.save(model, save_path)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "advisory-short",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:smp]",
   "language": "python",
   "name": "lijiaqi"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
